Estimating a Population Mean

HME Spr 2025

updated on 2025-02-05

Estimating a Population Mean: Eucalyptus trees

The Model of the Mean

# here is our dataset

tree_diameter <- c(42,43,58,70,47,51,85,63,58,46)

The Model of the Mean

If I gave you a dataset and asked you for the mean and variation, what would you do?

Probably something like mean(data) and maybe var(data) or sd(data). Let’s go ahead and do that:

(data.frame(summary = c('mean', 'SD', 'SE'),
           value = c(round(mean(tree_diameter),1), 
                     round(sd(tree_diameter),1), 
                     round(sd(tree_diameter)/sqrt(length(tree_diameter)),1))))
##   summary value
## 1    mean  56.3
## 2      SD  13.6
## 3      SE   4.3

What if I asked you to use a model to estimate the mean and the standard error of a dataset - how could you do this?

Although you might not think of it this way, assuming normally distributed random error, we can fit a linear model to our data and use the intercept estimate to describe the mean and standard error of our dataset. Remember, the intercept of a linear model is the mean (or expected) value of our response when our predictor(s) = 0; with no predictors, the intercept is simply the mean of the data.

summary(lm(tree_diameter ~ 1))
## 
## Call:
## lm(formula = tree_diameter ~ 1)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -14.30 -10.05  -1.80   5.45  28.70 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     56.3        4.3   13.09 3.65e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 13.6 on 9 degrees of freedom

The Model of the Mean

While finding these values is routine and simple, we'll overcomplicate it to illustrate a few key parts of Bayesian analysis. We'll now replicate this analysis in nimble.

The Model of the Mean

Steps to our analysis:

  1. Load the nimble package
  2. Write the model and pass to nimble using nimbleCode
  3. Bundle Data and Constants
  4. Bundle initial values
  5. Specify MCMC settings
  6. Run NIMBLE (see frog slides)

The Model of the Mean

What does a model look like for this?

The Model of the Mean

library(nimble)
## nimble version 1.0.1 is loaded.
## For more information on NIMBLE and a User Manual,
## please visit https://R-nimble.org.
## 
## Note for advanced users who have written their own MCMC samplers:
##   As of version 0.13.0, NIMBLE's protocol for handling posterior
##   predictive nodes has changed in a way that could affect user-defined
##   samplers in some situations. Please see Section 15.5.1 of the User Manual.
## 
## Attaching package: 'nimble'
## The following object is masked from 'package:stats':
## 
##     simulate
tree_model01 <- nimbleCode({
  
  
  # Note: historically had to specify "precision" in BUGS/JAGS
  # precision = 1/pop variance
  # pop variance = pop sd*pop sd
  # In Nimble, we can just specify SD, but need to be explicit in likelihood distribution
  
  # Priors
  pop.mean ~ dnorm(53, sd = 5) # need the "sd =" or have to provide precision (1/(5*5) = 0.04)
  
  # we also need the standard deviation of this sample.
  # we could approach it several ways; here we'll simply fix it at the SD of the
  # actual observations (we'll check out other options in subsequent models)
  pop.sd <- tree_sd # we'll pass `tree_sd` in as data
  
  # likelihood
  for(i in 1:nObs){
    tree[i] ~ dnorm(pop.mean, sd = pop.sd) 
  }

})
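The precision bookkeeping in the comments above can be sanity-checked in plain R. This is just a sketch, separate from the nimble model: it confirms that the `dnorm(53, sd = 5)` prior corresponds to a BUGS/JAGS precision of 0.04.

```r
# Sketch: "precision" is just 1/variance, so sd = 5 implies precision 0.04
sd_prior <- 5
prec_prior <- 1 / sd_prior^2          # 1/(5*5) = 0.04
# base R's dnorm() is parameterized by sd, so both forms give the same density:
all.equal(dnorm(50, mean = 53, sd = sd_prior),
          dnorm(50, mean = 53, sd = 1 / sqrt(prec_prior)))
```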

Bundle data and constants for NIMBLE

tree_data <- list(tree = tree_diameter,
                  tree_sd = sd(tree_diameter))
tree_constants <- list(nObs = length(tree_diameter))

Bundle initial values for NIMBLE

inits <- list(pop.mean = rnorm(n = 1, 53, 5))

Set up MCMC settings

# things we want `NIMBLE` to keep track of:
# (very useful with complexity)
keepers <- c('pop.mean', 'pop.sd') # do we really need to monitor pop.sd?

# MCMC settings
nc = 3         # why 3 chains?
nb = 1000      # why a burn-in?
ni = nb + 2000 # why this many iterations?
nt = 1         # why thin?

Package it all up and get samples

# one call
samples <- nimbleMCMC(
    code = tree_model01,
    constants = tree_constants,
    data = tree_data,
    inits = inits,
    monitors = keepers,
    niter = ni,
    nburnin = nb,
    thin = nt,
    nchains = nc,
    summary = T) # get jags-style summary of posterior
## Defining model
## Building model
## Setting data and initial values
## Running calculate on model
##   [Note] Any error reports that follow may simply reflect missing values in model variables.
## Checking model sizes and dimensions
## Checking model calculations
## Compiling
##   [Note] This may take a minute.
##   [Note] Use 'showCompilerOutput = TRUE' to see C++ compilation details.
## running chain 1...
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
## running chain 2...
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
## running chain 3...
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|

summarize samples

# function to summarize samples
getValues <- function(x){
  mean <- mean(x)
  sd <- sd(x)
  quants <- quantile(x, probs = c(0.025, 0.5, 0.975))
  out <- matrix(data = c(mean, sd, quants), nrow = 1, dimnames = list(1,c('mean',
                                                                          'sd',
                                                                          'lower95',
                                                                          'median',
                                                                          'upper95')))
  return(out)
}
# .......................................................................
# .......................................................................

# First, "Summary" gives us some simple stuff
samples$summary$all.chains
##              Mean   Median  St.Dev. 95%CI_low 95%CI_upp
## pop.mean 54.82732 54.77144 3.286314  48.41007  61.43406
## pop.sd   13.59779 13.59779 0.000000  13.59779  13.59779
# this is important - let's talk through it:
# - prior knowledge said roughly 43-63 (the 95% interval of the dnorm(53, sd = 5) prior)
# - we updated that prior knowledge with the data and reduced the 95% CI
# - pop.sd is exactly the value we passed in - it is a constant, never sampled


# (HOW DOES a 95% CREDIBLE INTERVAL differ from a 95% CONFIDENCE INTERVAL?)
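The "43-63" prior knowledge referenced above falls straight out of the `dnorm(53, sd = 5)` prior; a quick sketch:

```r
# Sketch: the 95% interval implied by the dnorm(53, sd = 5) prior
qnorm(c(0.025, 0.975), mean = 53, sd = 5)
# roughly 43.2 and 62.8 - i.e., the "43-63" prior range
```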

# .......................................................................
# INSPECT RESULTS
# .......................................................................

# convert to mcmc object for inspection via coda package
samples_mcmc <- coda::as.mcmc.list(lapply(samples$samples, coda::mcmc))

# Look at traceplots of the parameters
par(mfrow=c(1,2))
coda::traceplot(samples_mcmc[, 1:2])

# calculate Rhat convergence diagnostic of parameters
# "Gelman-Rubin statistic" - compares the variation of samples within each chain
# to the variation when the chains are pooled; within- and between-chain
# variation should stabilize as the chains converge (i.e., Rhat goes to 1)
# rule of thumb: anything < 1.1 is fine
coda::gelman.diag(samples_mcmc[,1]) # just look at pop.mean = all good
## Potential scale reduction factors:
## 
##      Point est. Upper C.I.
## [1,]          1          1
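What `gelman.diag()` computes can be sketched by hand. This is a simplified version (coda adds a small degrees-of-freedom correction); here we fake three well-mixed chains so the snippet runs on its own, but with real output you would use the `pop.mean` columns of `samples_mcmc` instead.

```r
# Simplified Gelman-Rubin calculation (sketch; assumes iid "chains" for demo)
set.seed(123)
chains <- lapply(1:3, function(i) rnorm(2000, mean = 56, sd = 3.3))
n <- length(chains[[1]])               # iterations per chain
W <- mean(sapply(chains, var))         # mean within-chain variance
B <- n * var(sapply(chains, mean))     # between-chain variance
Rhat <- sqrt(((n - 1) / n * W + B / n) / W)
Rhat   # ~1 when the chains agree
```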
# extract mean and SD

samplesdf <- do.call(rbind, samples_mcmc)
pop.mean <- samplesdf[, 1]
pop.sd <- samplesdf[, 2]
getValues(pop.mean)
##       mean       sd  lower95   median  upper95
## 1 54.82732 3.286314 48.41007 54.77144 61.43406
getValues(pop.sd)
##       mean sd  lower95   median  upper95
## 1 13.59779  0 13.59779 13.59779 13.59779
# MCMCplots is nice too
library(mcmcplots)
## Loading required package: coda
mcmcplot(samples$samples, dir = here::here('Modules/02_Intro_Bayes/files_for_slides'), filename = "tree_model01")
## Preparing plots for pop.mean.  50% complete.
## Preparing plots for pop.sd.  100% complete.

MCMCplots Results here

The Model of the Mean

How much of an influence did our priors have on our estimate of the population mean?

What if we just plead complete ignorance? Like, non-ecologist knowledge of tree diameters.

tree_model02 <- nimbleCode({
  
  ## Priors ##
  pop.mean ~ dunif(0,200) # the population mean has an equal probability of being any number between 0 and 200. 200 is arbitrary - what if we put 500? 1000?

  pop.sd ~ dunif(0, 100) # the pop.sd has an equal probability of being any number between 0 and 100
  
  # NOTE: before nimble, had to do something like this:
  # pop.sd ~ dunif(0, 100)
  # pop.var <- pop.sd * pop.sd
  # pop.prec <- 1/pop.var # pass pop.prec to dnorm below
  
  # likelihood
  for(i in 1:nObs){
    tree[i] ~ dnorm(pop.mean, sd = pop.sd) 
  }

})

Specify Initial Values

Since we changed the prior distributions, we need to modify the starting values that we pass to nimble. We also need to pass a starting value for pop.sd, since we are no longer providing it as data.

We are pretty safe with almost any starting values because of the super-flat priors we specified - it would be hard to mess up.

Below, I just pull random numbers from distributions identical to the priors:

inits <- list(pop.mean = runif(n = 1, min = 0, max = 200),
              pop.sd = runif(n = 1, min = 0, max = 100))

Bundle data and constants for NIMBLE

Note that we are no longer passing tree_sd in as data - we'll pull pop.sd from its prior instead.

tree_data <- list(tree = tree_diameter)
tree_constants <- list(nObs = length(tree_diameter))

Set up MCMC settings

# things we want `NIMBLE` to keep track of:
# (very useful with complexity)
keepers <- c('pop.mean', 'pop.sd')

# MCMC settings
nc = 3
nb = 1000
ni = nb + 2000
nt = 1

Package it all up and get samples

# one call
samples02 <- nimbleMCMC(
    code = tree_model02, # changed model name
    constants = tree_constants,
    data = tree_data,
    inits = inits,
    monitors = keepers,
    niter = ni,
    nburnin = nb,
    thin = nt,
    nchains = nc,
    summary = T) # get jags-style summary of posterior
## Defining model
## Building model
## Setting data and initial values
## Running calculate on model
##   [Note] Any error reports that follow may simply reflect missing values in model variables.
## Checking model sizes and dimensions
## Checking model calculations
## Compiling
##   [Note] This may take a minute.
##   [Note] Use 'showCompilerOutput = TRUE' to see C++ compilation details.
## running chain 1...
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
## running chain 2...
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
## running chain 3...
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|

What just happened?

# First, "Summary" gives us some simple stuff
samples02$summary$all.chains
##              Mean   Median  St.Dev. 95%CI_low 95%CI_upp
## pop.mean 56.26674 56.23745 5.308168  45.83599  67.05120
## pop.sd   15.76228 14.76271 4.458699   9.64558  26.98514
# .......................................................................
# INSPECT RESULTS
# .......................................................................

# convert to mcmc object for inspection via coda package
samples_mcmc <- coda::as.mcmc.list(lapply(samples02$samples, coda::mcmc))

# Look at traceplots of the parameters
par(mfrow=c(1,2))
coda::traceplot(samples_mcmc[, 1:2])

# calculate Rhat convergence diagnostic of parameters
# "Gelman-Rubin statistic" - compares the variation of samples within each chain
# to the variation when the chains are pooled; within- and between-chain
# variation should stabilize as the chains converge (i.e., Rhat goes to 1)
# rule of thumb: anything < 1.1 is fine
coda::gelman.diag(samples_mcmc) # we can now look at both mean and sd
## Potential scale reduction factors:
## 
##          Point est. Upper C.I.
## pop.mean       1.01       1.03
## pop.sd         1.01       1.03
## 
## Multivariate psrf
## 
## 1.01
# extract posterior draws of the mean and SD

samplesdf <- do.call(rbind, samples_mcmc)
pop.mean <- samplesdf[, 1]
pop.sd <- samplesdf[, 2]
getValues(pop.mean)
##       mean       sd  lower95   median upper95
## 1 56.26674 5.308168 45.83599 56.23745 67.0512
getValues(pop.sd)
##       mean       sd lower95   median  upper95
## 1 15.76228 4.458699 9.64558 14.76271 26.98514
# MCMCplots is nice too
library(mcmcplots)
mcmcplot(samples02$samples, dir = here::here('Modules/02_Intro_Bayes/files_for_slides'), filename = "tree_model02")
## Preparing plots for pop.mean.  50% complete.
## Preparing plots for pop.sd.  100% complete.

MCMCplots Results here

What just happened?

This is a good look at typical output from a Bayesian model. It's a bit messy, and each chain is slightly different.

  - You should note the increased uncertainty in our distributions.
  - The good news: even pleading ignorance, our 95% CI of the mean overlaps truth.
  - Just to show that the "flatness" of the prior shouldn't change things too much - flat is flat.

tree_model03 <- nimbleCode({
  
  ## Priors ##
  pop.mean ~ dunif(0,1000) # the population mean has an equal probability of being any number between 0 and 1000 - even wider than the arbitrary 200 above

  pop.sd ~ dunif(0, 500) # the pop.sd has an equal probability of being any number between 0 and 500
  
  # NOTE: before nimble, had to do something like this:
  # pop.sd ~ dunif(0, 100)
  # pop.var <- pop.sd * pop.sd
  # pop.prec <- 1/pop.var # pass pop.prec to dnorm below
  
  # likelihood
  for(i in 1:nObs){
    tree[i] ~ dnorm(pop.mean, sd = pop.sd) 
  }

})

# don't really need to change the initial values - they fall within the (now wider) priors and are still random
inits <- list(pop.mean = runif(n = 1, min = 0, max = 200),
              pop.sd = runif(n = 1, min = 0, max = 100))

# data and constants
tree_data <- list(tree = tree_diameter)
tree_constants <- list(nObs = length(tree_diameter))


# params to monitor
keepers <- c('pop.mean', 'pop.sd')

# MCMC settings
nc = 3
nb = 1000
ni = nb + 2000
nt = 1


# one call
samples03 <- nimbleMCMC(
    code = tree_model03, # changed model name
    constants = tree_constants,
    data = tree_data,
    inits = inits,
    monitors = keepers,
    niter = ni,
    nburnin = nb,
    thin = nt,
    nchains = nc,
    summary = T) # get jags-style summary of posterior
## Defining model
## Building model
## Setting data and initial values
## Running calculate on model
##   [Note] Any error reports that follow may simply reflect missing values in model variables.
## Checking model sizes and dimensions
## Checking model calculations
## Compiling
##   [Note] This may take a minute.
##   [Note] Use 'showCompilerOutput = TRUE' to see C++ compilation details.
## running chain 1...
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
## running chain 2...
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
## running chain 3...
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
# First, "Summary" gives us some simple stuff
samples03$summary$all.chains
##              Mean   Median  St.Dev. 95%CI_low 95%CI_upp
## pop.mean 56.34326 56.38719 4.993508 45.934104  66.23678
## pop.sd   16.03778 15.22397 4.639292  9.649754  27.53042
# compare with model02: even with much flatter priors, the posterior
# summaries barely changed - flat is flat
# unlike model01, pop.sd is sampled here, so it has its own posterior spread

# .......................................................................
# INSPECT RESULTS
# .......................................................................

# convert to mcmc object for inspection via coda package
samples_mcmc <- coda::as.mcmc.list(lapply(samples03$samples, coda::mcmc))

# Look at traceplots of the parameters
par(mfrow=c(1,2))
coda::traceplot(samples_mcmc[, 1:2])

# calculate Rhat convergence diagnostic of parameters
# "Gelman-Rubin statistic" - compares the variation of samples within each chain
# to the variation when the chains are pooled; within- and between-chain
# variation should stabilize as the chains converge (i.e., Rhat goes to 1)
# rule of thumb: anything < 1.1 is fine
coda::gelman.diag(samples_mcmc) # we can now look at both mean and sd
## Potential scale reduction factors:
## 
##          Point est. Upper C.I.
## pop.mean       1.00       1.01
## pop.sd         1.01       1.02
## 
## Multivariate psrf
## 
## 1
# extract posterior draws of the mean and SD

samplesdf <- do.call(rbind, samples_mcmc)
pop.mean <- samplesdf[, 1]
pop.sd <- samplesdf[, 2]
getValues(pop.mean)
##       mean       sd lower95   median  upper95
## 1 56.34326 4.993508 45.9341 56.38719 66.23678
getValues(pop.sd)
##       mean       sd  lower95   median  upper95
## 1 16.03778 4.639292 9.649754 15.22397 27.53042
# MCMCplots is nice too
library(mcmcplots)
mcmcplot(samples03$samples, dir = here::here('Modules/02_Intro_Bayes/files_for_slides'), filename = "tree_model03")
## Preparing plots for pop.mean.  50% complete.
## Preparing plots for pop.sd.  100% complete.

MCMCplots Results here

What just happened?

# model
tree_model04 <- nimbleCode({
  
  ## Priors ##
  pop.mean ~ dnorm(0, sd = 100) # still vague/'uninformative', but now a normal prior rather than a uniform
  
  # to set up the standard deviation, we need to make sure that the values are not negative.
  # If you remember, a gamma distribution has support from 0 -> Inf. This is a good choice.
  # we'll set the gamma on the precision term and then transform that to something more natural to us - the std. dev. 
  prec ~ dgamma(0.1, 0.1)
  pop.sd <- 1/sqrt(prec)
  

  # likelihood
  for(i in 1:nObs){
    tree[i] ~ dnorm(pop.mean, sd = pop.sd) 
  }

})
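The precision-to-SD transform in tree_model04 can be sanity-checked in plain R, independent of nimble. The sketch below just confirms that gamma draws for the precision are strictly positive, so the implied standard deviation is always well-defined.

```r
# Sketch: draws from dgamma(0.1, 0.1) are strictly positive, so the
# implied standard deviation 1/sqrt(prec) is always positive too
set.seed(42)
prec_draws <- rgamma(1e4, shape = 0.1, rate = 0.1)
sd_draws <- 1 / sqrt(prec_draws)
all(sd_draws > 0)   # the support of the implied SD prior is (0, Inf)
```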

# inits
inits <- list(pop.mean = rnorm(n = 1, 100, 10), # now rnorm
              prec = rgamma(n = 1, 0.1,0.1))

# gather data and constants
tree_data <- list(tree = tree_diameter)
tree_constants <- list(nObs = length(tree_diameter))

# parameters to monitor
keepers <- c('pop.mean', 'pop.sd')

# MCMC settings
nc = 3
nb = 5000
ni = nb + 5000
nt = 1

# get samples
samples04 <- nimbleMCMC(
    code = tree_model04, # changed model name
    constants = tree_constants,
    data = tree_data,
    inits = inits,
    monitors = keepers,
    niter = ni,
    nburnin = nb,
    thin = nt,
    nchains = nc,
    summary = T)
## Defining model
## Building model
## Setting data and initial values
## Running calculate on model
##   [Note] Any error reports that follow may simply reflect missing values in model variables.
## Checking model sizes and dimensions
## Checking model calculations
## Compiling
##   [Note] This may take a minute.
##   [Note] Use 'showCompilerOutput = TRUE' to see C++ compilation details.
## running chain 1...
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
## running chain 2...
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
## running chain 3...
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
# check it out
samples04$summary$all.chains
##              Mean   Median  St.Dev. 95%CI_low 95%CI_upp
## pop.mean 56.20605 56.20693 4.713230 46.725845  65.62689
## pop.sd   14.58666 13.94124 3.764879  9.207031  24.14396
# convert to mcmc object for inspection via coda package
samples_mcmc <- coda::as.mcmc.list(lapply(samples04$samples, coda::mcmc))

# Look at traceplots of the parameters
par(mfrow=c(1,2))
coda::traceplot(samples_mcmc[, 1:2])

# calculate Rhat convergence diagnostic of parameters
# "Gelman-Rubin statistic" - compares the variation of samples within each chain
# to the variation when the chains are pooled; within- and between-chain
# variation should stabilize as the chains converge (i.e., Rhat goes to 1)
# rule of thumb: anything < 1.1 is fine
coda::gelman.diag(samples_mcmc) # we can now look at both mean and sd
## Potential scale reduction factors:
## 
##          Point est. Upper C.I.
## pop.mean          1          1
## pop.sd            1          1
## 
## Multivariate psrf
## 
## 1
library(mcmcplots)
mcmcplot(samples04$samples, dir = here::here('Modules/02_Intro_Bayes/files_for_slides'), filename = "tree_model04")
## Preparing plots for pop.mean.  50% complete.
## Preparing plots for pop.sd.  100% complete.

MCMCplots Results here

What just happened?

Your turn! Estimating a Population Mean - Male Peregrine Falcons

White (1968) provides information on the weights of 12 adult male peregrines.

peregrines <- c(616, 653, 658, 608, 575, 621, 583, 602, 581, 604, 584, 604)
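Before writing the nimble model, a quick classical check in the spirit of the opening slides might look like this. This is just a sketch to give you a benchmark; the Bayesian version is your exercise.

```r
# Quick classical summary to compare your posterior against later
peregrines <- c(616, 653, 658, 608, 575, 621, 583, 602, 581, 604, 584, 604)
mean(peregrines)                           # sample mean
sd(peregrines)                             # sample SD
sd(peregrines) / sqrt(length(peregrines))  # standard error of the mean
```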